Senses of Polysemous Nouns: Building a Computational Lexicon of Basic Japanese Nouns
Authors
Abstract
We have constructed the IPA Lexicon of Basic Japanese Nouns (IPAL-BN), which has a hierarchical structure based on the syntactic and semantic properties of nouns. In our lexicon, each lexical entry consists of subentries, and subentries have semantic property information. Among these elements, we focus here on the subentry description. Conventional Japanese dictionaries only enumerate various usages, but it is also important to clarify the semantic relations between subentries. Thus we have developed a method for specifying the kind of relationship between subentries, using special cognitive devices such as metaphor, metonymy, and synecdoche. After a brief review of the structure of our lexicon, we discuss how the method can be applied to the lexical description.

1 Introduction

The Information-technology Promotion Agency (IPA), a special juridical body under the jurisdiction of the Ministry of International Trade and Industry, Japan, has compiled the IPA Lexicon of the Japanese Language for Computers: Basic Japanese Verbs (IPAL-BV) (1987) and Basic Japanese Adjectives (IPAL-BA) (1990). The IPAL-BV contains 861 verbs and the IPAL-BA contains 136 adjectives as lexical entries. These lexicons are available for public use and have been widely used in various university and research institute projects that have yielded encouraging results. We started work on the IPAL-BN project in 1990. In May 1996, we released the third edition of the IPAL-BN, with 1,081 nouns as lexical entries, to the public on networks with FTP service.

The IPAL project is characterized by its linguistic basis. For example, the hierarchical structure of the IPAL-BN, which consists of lexical entries, subentries, and semantic property information, reflects our linguistic considerations concerning the syntactic and semantic properties of nouns. Another example of the benefits of our linguistically inspired approach is the description of the kind of relationship between subentries. Such information would be useful in various applications, but is not yet explicitly provided in existing Japanese dictionaries.

In the following sections, we first briefly introduce the general structure of the IPAL-BN, and then describe our method for specifying the kind of relationship between subentries. In the concluding remarks, we also touch on implications of the method for application systems.

2 Structure of IPAL-BN

Figure 1 shows the top-level structure of the IPAL-BN. The IPAL-BN consists of 1,081 lexical entries.
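The hierarchy described above (lexical entries containing subentries, subentries carrying semantic property information, and typed relations between subentries) can be pictured with a minimal Python sketch. This is only an illustration under assumptions: the class names, field names, and the sample entry are hypothetical and are not taken from the IPAL-BN data format or its contents. Only the three relation kinds (metaphor, metonymy, synecdoche) come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class RelationKind(Enum):
    """Relation kinds named in the abstract."""
    METAPHOR = "metaphor"
    METONYMY = "metonymy"
    SYNECDOCHE = "synecdoche"


@dataclass
class Subentry:
    """One sense of a noun (hypothetical field names)."""
    sense_id: int
    gloss: str
    semantic_properties: list[str] = field(default_factory=list)


@dataclass
class SubentryRelation:
    """A typed link from one subentry to another within the same entry."""
    source_id: int
    target_id: int
    kind: RelationKind


@dataclass
class LexicalEntry:
    """A headword with its subentries and the relations between them."""
    headword: str
    subentries: list[Subentry] = field(default_factory=list)
    relations: list[SubentryRelation] = field(default_factory=list)

    def related_senses(self, sense_id: int) -> list[tuple[RelationKind, Subentry]]:
        """Return (kind, subentry) pairs reachable from the given sense."""
        by_id = {s.sense_id: s for s in self.subentries}
        return [
            (r.kind, by_id[r.target_id])
            for r in self.relations
            if r.source_id == sense_id
        ]


# Invented example for illustration only; not an actual IPAL-BN entry.
entry = LexicalEntry(
    headword="atama",
    subentries=[
        Subentry(1, "head (body part)", ["concrete", "body-part"]),
        Subentry(2, "leader of a group", ["human", "role"]),
    ],
    relations=[SubentryRelation(1, 2, RelationKind.METAPHOR)],
)

for kind, sense in entry.related_senses(1):
    print(f"{entry.headword}: sense 1 -> sense {sense.sense_id} via {kind.value}")
```

Storing relations as separate typed links, rather than nesting senses inside one another, is one possible design choice: it lets an application ask which senses are derived from a given sense and by which device, which is the kind of information the text says conventional dictionaries leave implicit.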
Similar resources
Polysemy Index for Nouns: an Experiment on Italian using the PAROLE SIMPLE CLIPS Lexical Database
An experiment is presented to induce a set of polysemous basic type alternations (such as ANIMAL-FOOD, or BUILDING-INSTITUTION) by deriving them from the sense alternations found in an existing lexical resource. The paper builds on previous work and applies those results to the Italian lexicon PAROLE SIMPLE CLIPS. The new results show how the set of frequent type alternations that can be induce...
A Word-Embedding-based Sense Index for Regular Polysemy Representation
We present a method for the detection and representation of polysemous nouns, a phenomenon that has received little attention in NLP. The method is based on the exploitation of the semantic information preserved in Word Embeddings. We first prove that polysemous nouns instantiating a particular sense alternation form a separate class when clustering nouns in a lexicon. Such a class, however, do...
Carving up word meaning: Portioning and grinding
Two eye-tracking experiments investigated the processing of mass nouns used as count nouns and count nouns used as mass nouns. Following Copestake and Briscoe (1995), the basic or underived sense of a word was treated as the input to a derivational rule (“grinding” or “portioning”) which produced the derived sense as output. It was hypothesized that in the absence of biasing evidence readers wo...
An Analysis of Persian Compound Nouns as Constructions
In Construction Morphology (CM), a compound is treated as a construction at the word level with a systematic correlation between its form and meaning, in the sense that any change in the form is accompanied by a change in the meaning. Compound words are coined by compounding templates which are called abstract schemas in CM. These abstract constructional schemas generalize over sets of existing...
Handling Subtle Sense Distinctions Through Wordnet Semantic Types
In this paper we challenge the question of whether there is value in having multiple layers of semantic information associated with corpus semantic annotation. In this context we introduce a semantic annotation experiment in which novice annotators were asked to assign sense tags to a set of polysemous corpus nouns, using Wordnet as their referential sense repository. Wordnet is a rich sense in...